Implement stateful surpasses() for until()#7745
Implement stateful surpasses() for until()#7745Manishearth merged 12 commits intounicode-org:mainfrom
Conversation
885f70e to
a5e6c00
Compare
Manishearth
left a comment
There was a problem hiding this comment.
Waiting on review until Shane's suggestion in #7739 (comment) is implemented
|
what are the performance numbers here with #7739 as the baseline? |
These numbers are for the current, overly complicated implementation. I filtered out the week/day stuff which isn't relevant since those are handled by the (unchanged) fast RD path. |
|
I forgot to mention, I reverted the This is not finished, but submitted now to see if this approach looks good to folks. TODO next:
|
c8e9585 to
e5e08cc
Compare
|
@sffc (or anyone else for whom this has an obvious answer): After this change, Should the original (non-stateful) implementation just be deleted? |
|
It might be useful to have it stick around so we can prove that it has the same behavior, but aside from that it seems ok to delete. I haven't looked at the code yet, let's wait for Shane to look. |
|
@Manishearth Just saw PR #7753, nice. I will merge and retest and I will be done with this, pending review feedback. |
…eSurpasses spec
This is not ready to go, but I wanted to get some feedback.
Most of the following will not make sense until one has seen the code.
- The priority here is to stick to the spec, and avoid recalculation.
There are some clumsy aspects, like the need for the caller to
invoke the checker in a specific way:
1. Large fields to small fields.
2. Before moving from field A to field B, make a final
call with the computed value of field A, to set the internal
state correctly.
3. Caller maintains their own y/m/w/d.
I tried to work around unicode-org#2 and unicode-org#3, but couldn't come up with
anything clean. It seems like the current approach is a good
compromise.
- For the debug assertions, clippy complained that I was calling
a mut function. This is actually ok, because the object
is reset to a new state immediately afterwards. The workaround
to the clippy warning is expensive, but should be ok because it's
just for debug. I tried other approaches but didn't get anything
else to work. Adding an "allow" directive didn't work because
debug_assert! is a macro.
- There is now no in-library usage of surpasses(). All the testing
for that was via until() as far as I can tell. The work item is
to add test coverage for surpasses(), and once that looks good,
I will refactor surpasses() to use the stateful checker, to
optimize performance and eliminate duplicate code.
8ebf5ed to
2cb4975
Compare
sffc
left a comment
There was a problem hiding this comment.
Yes, this is very close to what I had in mind!
| let surpasses_years = ArithmeticDate::<C>::compare_surpasses_lexicographic( | ||
| self.sign, | ||
| self.y0, | ||
| base_month, | ||
| self.parts.day(), | ||
| self.cal_date_2, | ||
| self.cal, | ||
| ); |
There was a problem hiding this comment.
Nit: This doesn't use the results of the m0 calculations, and it implements step 4 in the spec. Move this up to the comment about step 4, and move constrain and m0 to step 5.
| // ordinal month 6), a lexicographical-only check fails to detect that a later | ||
| // day in Adar I actually surpasses an earlier day in Adar. This ordinal check | ||
| // ensures the year counter correctly stays at 0 instead of over-incrementing to 1. | ||
| surpasses_years || self.surpasses_months(0) |
There was a problem hiding this comment.
Thanks for the comment. It correctly justifies why you need to do this in terms of calendar logic. If you prefer, the following explanation is shorter and also correct.
If the lexicographic check returns false, the spec continues to the ordinal check, which is implemented in
surpasses_months.
| self.end_of_month.day | ||
| }; | ||
|
|
||
| surpasses |
There was a problem hiding this comment.
Suggestion: add a comment like
// We do not need to perform any other checks because step 8 of the spec guarantees an early return if weeks and days are 0.
| self.cal_date_2, | ||
| ); | ||
|
|
||
| // 9. Let endOfMonth be BalanceNonISODate(calendar, monthsAdded.[[Year]], monthsAdded.[[Month]] + 1, 0). |
There was a problem hiding this comment.
Suggestion (optional): consider moving steps 9-12 to a function set_month which you call up above in the until impl only once when you need it. It seems like you can get a little bit more performance if you avoid calling new_balanced (which is expensive) until you need it for weeks and days.
You could also move step 5 into a new fn set_year but that's less impactful because we need to run that code each time for reasons explained in the 8-line comment you wrote.
There was a problem hiding this comment.
Done in commit 7dc6469
I didn't do the same in set_year because, not only is it less impactful, it's also a little messier. We'd need another state variable for base_month, or we'd have to copy-paste the code from surpasses_year which is a little ugly.
Co-authored-by: Shane F. Carr <shane@unicode.org>
- Move some code around to more closely match the spec - Improve performance of month iteration by minimizing calls to `new_balanced` - Add explicit `set_<field>` methods to `SurpassesChecker`, to be clearer semantically. This is a place to "freeze" the internal state of the checker for a given field value. These usually inline to call `surpasses_<field>`, but in the case of months, they do the expensive `new_balanced` call. Future optimizations can move more stuff into `set_<field>`.
|
Can I get a quick review on f58a9da ( UPDATE: Never mind, it's supposed to go in the PR description, let me fix that |
|
@sffc Could you push the merge (squash) button? Looks like it's good to go. |
Improve performance of until() by introducing a new, stateful implementation of surpasses() that minimizes recalculation in field iteration loops.
Changelog
icu_calendar: Speed up
untilyear and month field handling by 75% on average by optimizingsurpassescalculation